18 research outputs found

    Templates as a method for implementing data provenance in decision support systems

    Decision support systems are used to promote consistent, guideline-based diagnosis and to support clinical reasoning at the point of care. However, despite the availability of numerous commercial products, the wider acceptance of these systems has been hampered by concerns about diagnostic performance and a perceived lack of transparency in the process of generating clinical recommendations. This resonates with the Learning Health System paradigm, which promotes data-driven medicine relying on routine data capture and transformation, and which also stresses the need for trust in an evidence-based system. Data provenance is a way of automatically capturing the trace of a research task and its resulting data, thereby facilitating trust and the principles of reproducible research. While computational domains have started to embrace this technology through provenance-enabled execution middlewares, traditionally non-computational disciplines that do not rely on a single software platform, such as medical research, are still struggling with its adoption. To address these issues, we introduce provenance templates: abstract provenance fragments representing meaningful domain actions. Templates can be used to generate a model-driven service interface for domain software tools to routinely capture the provenance of their data and tasks. This paper specifies the requirements for a Decision Support tool based on the Learning Health System, introduces the theoretical model for provenance templates and demonstrates the resulting architecture. Our methods were tested and validated on the provenance infrastructure for a Diagnostic Decision Support System that was developed as part of the EU FP7 TRANSFoRm project.

    A comparison of machine learning techniques for detection of drug target articles

    Important progress in treating diseases has been possible thanks to the identification of drug targets. Drug targets are the molecular structures whose abnormal activity, associated with a disease, can be modified by drugs, improving the health of patients. The pharmaceutical industry needs to give priority to their identification and validation in order to reduce long and costly drug development times. In the last two decades, our knowledge about drugs, their mechanisms of action and drug targets has rapidly increased. Nevertheless, most of this knowledge is hidden in millions of medical articles and textbooks. Extracting knowledge from this large amount of unstructured information is a laborious job, even for human experts. The aim of this paper is the identification of drug target articles, a crucial first step toward the automatic extraction of information from texts. Several machine learning techniques are compared in order to obtain a satisfactory classifier for detecting drug target articles using semantic information from biomedical resources such as the Unified Medical Language System. The best result was achieved by a Fuzzy Lattice Reasoning classifier, which reaches a ROC area of 98%. This research paper is supported by Projects TIN2007-67407-C03-01, S-0505/TIC-0267 and MICINN project TEXT-ENTERPRISE 2.0 TIN2009-13391-C04-03 (Plan I+D+i), as well as by the Juan de la Cierva program of the MICINN of Spain.

    Automatic Drug-Drug Interaction Detection: A Machine Learning Approach With Maximal Frequent Sequence Extraction

    A Drug-Drug Interaction (DDI) occurs when the effects of a drug are modified by the presence of other drugs. DDIExtraction2011 proposes a first challenge task, Drug-Drug Interaction Extraction, to compare different techniques for DDI extraction and to set a benchmark that will enable future systems to be tested. The goal of the competition is to decide, for every pair of drugs in a sentence, whether an interaction is being described or not. We built a machine learning system based on bag-of-words features and pattern extraction. Bag-of-words and other drug-level and character-level features proved to have high discriminative power for detecting DDIs, while pattern extraction provided a moderate improvement, indicating a promising line for further research. This work has been done in the framework of the VLC/CAMPUS Microcluster on Multimodal Interaction in Intelligent Systems. Contributions of the first and second authors have been supported and partially funded by bitsnbrains S.L. The contribution of the fourth author has been partially funded by the European Commission as part of the WIQEI IRSES project (grant no. 269180) within the FP 7 Marie Curie People Framework, and by MICINN as part of the Text-Enterprise 2.0 project (TIN2009-13391-C04-03) within the Plan I+D+i. Computational resources for this research have been kindly provided by Daniel Kuehn from [email protected] García Blasco, S.; Mola Velasco, SM.; Danger Mercaderes, RM.; Rosso, P. (2011). Automatic Drug-Drug Interaction Detection: A Machine Learning Approach With Maximal Frequent Sequence Extraction. CEUR Workshop Proceedings. 761:51-58. http://hdl.handle.net/10251/33478

    Towards a Protein-Protein Interaction information extraction system: recognizing named entities

    The majority of biological functions of any living being are related to Protein-Protein Interactions (PPI). PPI discoveries are reported in the form of research publications whose volume grows day after day. Consequently, automatic PPI information extraction systems are a pressing need for biologists. In this paper we are mainly concerned with the named entity detection module of PPIES (the PPI information extraction system we are implementing), which recognizes twelve entity types relevant in the PPI context. It is composed of two sub-modules: a dictionary look-up with extensive normalization and acronym detection, and a Conditional Random Field classifier. The dictionary look-up module has been tested with the Interaction Method Task (IMT), and it improves by approximately 10% on the current solutions that do not use Machine Learning (ML). The second module has been used to create a classifier using the Joint Workshop on Natural Language Processing in Biomedicine and its Applications (JNLPBA 04) data set. It does not use any external resources, or complex or ad hoc post-processing, and obtains 77.25%, 75.04% and 76.13% for precision, recall, and F1-measure, respectively, improving all previous results obtained for this data set. This work has been funded by MICINN, Spain, as part of the "Juan de la Cierva" Program and the Project DIANA-Applications (TIN2012-38603-C02-01), as well as by the European Commission as part of the WIQ-EI IRSES Project (Grant No. 269180) within the FP 7 Marie Curie People Framework. Danger Mercaderes, RM.; Pla Santamaría, F.; Molina Marco, A.; Rosso, P. (2014). Towards a Protein-Protein Interaction information extraction system: recognizing named entities. Knowledge-Based Systems. 57:104-118. https://doi.org/10.1016/j.knosys.2013.12.010
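The three scores reported in the abstract above can be cross-checked against each other, since the F1-measure is the harmonic mean of precision and recall. The following is only a minimal sketch of that arithmetic; the function name `f1_score` is a hypothetical helper introduced here, not something taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract (percentages).
reported_precision = 77.25
reported_recall = 75.04

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 2))  # agrees with the reported 76.13
```

The harmonic mean sits slightly below the arithmetic mean of the two scores (76.145), which is why the reported F1 is 76.13 rather than 76.15.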

    Differential Privacy: What is all the noise about?

    Differential Privacy (DP) is a formal definition of privacy that provides rigorous guarantees against risks of privacy breaches during data processing. It makes no assumptions about the knowledge or computational power of adversaries, and provides an interpretable, quantifiable and composable formalism. DP has been actively researched during the last 15 years, but it is still hard to master for many Machine Learning (ML) practitioners. This paper aims to provide an overview of the most important ideas, concepts and uses of DP in ML, with special focus on its intersection with Federated Learning (FL). Comment: 27 pages, 7 figures
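To make "quantifiable and composable" concrete, here is a minimal sketch of the Laplace mechanism, a standard DP building block. It is not taken from the paper: the abstract does not say which mechanisms it covers, and all function names below are hypothetical. It assumes a counting query, whose sensitivity is 1 because adding or removing one record changes the count by at most 1.

```python
import random


def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Noise scale b = sensitivity / epsilon for the Laplace mechanism."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon: float, rng: random.Random) -> float:
    """Epsilon-DP count of the items satisfying `predicate`."""
    true_count = sum(1 for v in values if predicate(v))
    # Assumption: counting query, so the sensitivity is 1.
    return true_count + laplace_noise(laplace_scale(1.0, epsilon), rng)


def composed_epsilon(epsilons) -> float:
    """Basic sequential composition: privacy budgets add up."""
    return sum(epsilons)


rng = random.Random(0)
ages = [23, 35, 41, 52, 67, 70]  # toy data, invented for illustration
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
total_budget = composed_epsilon([0.5, 0.25])  # two queries on the same data
```

A smaller epsilon gives a larger noise scale and therefore stronger privacy at the cost of accuracy. The additive budget in `composed_epsilon` is the simplest composition bound; tighter accounting methods exist.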

    Protein-Protein Interaction text analysis

    Protein-Protein Interaction (PPI) detection is one of the most important fields in biomedical research due to its enormous interest for genetics, medicine and pharmacology. Therefore, (semi-)automatic analysis of biomedical texts is critical for retrieving and maintaining the experimental descriptions that justify the presence or absence of such interactions. This paper describes an automatic, on-line system which, given a text, verifies whether it corresponds to a PPI article, and recognizes and returns to the user various types of entities associated with this research context. This work has been funded by the TEXT-ENTERPRISE 2.0 project (TIN2009-13391-C04-03) and by the “Juan de la Cierva” program of the Ministry of Science and Technology.

    Generating complex ontology instances from documents

    This paper presents a novel Information Extraction system able to generate complex instances from free text available on the Web. The approach is based on non-monotonic processing over ontologies, and makes use of entity recognizers and disambiguators in order to adequately extract and combine instances and their relations. Experiments conducted in the archaeology research domain provide encouraging results in both efficiency and efficacy and suggest that the tool is suitable for application to other similar Semantic Web resources. © 2009 Elsevier Inc. All rights reserved.

    Access control and view generation for provenance graphs

    Data provenance refers to the knowledge about data sources and operations carried out to obtain some piece of data. A provenance-enabled system maintains a record of the interoperation of processes across different modules, stages and authorities to capture the full lineage of the resulting data, and typically allows data-focused audits using semantic technologies, such as ontologies, that capture domain knowledge. However, regulating access to captured provenance data is a non-trivial problem, since execution records form complex, overlapping graphs in which individual nodes may be subject to different access policies. Applying traditional access control to provenance queries can hide the entire graph from the user whenever it contains nodes to which access is denied, reveal too much information, or return a semantically invalid graph. An alternative approach is to answer queries with a new graph that abstracts over the missing nodes and fragments. In this paper, we present TACLP, an access control language for provenance data that supports this approach, together with an algorithm that transforms graphs according to sets of access restrictions. The algorithm produces safe and valid provenance graphs that retain the maximum amount of information allowed by the security model. The approach is demonstrated on an example of restricting access to a clinical trial provenance trace.